Validation of bone mineral density measurement using quantitative CBCT image based on deep learning

The bone mineral density (BMD) measurement is a direct method of estimating human bone mass for diagnosing osteoporosis, and is performed to objectively evaluate bone quality before implant surgery in dental clinics. The objective of this study was to validate the accuracy and reliability of BMD measurements made using quantitative cone-beam CT (CBCT) images based on deep learning by applying the method to clinical data from actual patients. Datasets containing 7500 pairs of CT and CBCT axial slice images from 30 patients were used to train and test a previously developed deep-learning model (QCBCT-NET). We selected 36 volumes of interest in the CBCT images for each patient in the bone regions of potential implant sites on the maxilla and mandible. We compared the BMDs shown in the quantitative CBCT (QCBCT) images with those in the conventional CBCT (CAL_CBCT) images at the various bone sites of interest across the entire field of view (FOV) using the performance metrics of the MAE (mean absolute error), RMSE (root mean square error), MAPE (mean absolute percentage error), R² (coefficient of determination), and SEE (standard error of estimation). Compared with the ground truth (QCT) images, the accuracy of the BMD measurements from the QCBCT images showed an RMSE of 83.41 mg/cm³, MAE of 67.94 mg/cm³, and MAPE of 8.32% across all the bone sites of interest, whereas for the CAL_CBCT images, those values were 491.15 mg/cm³, 460.52 mg/cm³, and 54.29%, respectively. The linear regression between the QCBCT and QCT images showed a slope of 1.00 and an R² of 0.85, whereas for the CAL_CBCT images, those values were 0.32 and 0.24, respectively. The overall SEE between the QCBCT images and QCT images was 81.06 mg/cm³, whereas the SEE for the CAL_CBCT images was 109.32 mg/cm³. The QCBCT images thus showed better accuracy, linearity, and uniformity than the CAL_CBCT images across the entire FOV.
The BMD measurements from the quantitative CBCT images showed high accuracy, linearity, and uniformity regardless of the relative geometric positions of the bone in the potential implant site. When applied to actual patient CBCT images, the CBCT-based quantitative BMD measurement based on deep learning demonstrated high accuracy and reliability across the entire FOV.

In addition, we obtained MDCT and CBCT images of a BMD calibration phantom (QRM-BDC Phantom 200 mm length, QRM GmbH, Moehrendorf, Germany) with calcium hydroxyapatite inserts of three densities (0 (water), 100, and 200 mg/cm³) under the same conditions (Fig. 1). The MDCT images of the patients were converted into quantitative CT (QCT) images by linearly calibrating the HU in the patient data with the HU in the MDCT images of the BMD calibration phantom. The CBCT images of the patients were also converted into calibrated CBCT (CAL_CBCT) images using the corresponding images of the BMD calibration phantom for comparison. The CAL_CBCT images were afterwards used for comparisons with the results of deep learning (described below) 77. The MDCT and corresponding CBCT images were matched by paired-point registration using software (3D Slicer, MIT, Massachusetts, US) with manually localized landmarks for deep learning 77. We prepared 4500 pairs of axial slice QCT and CBCT images from 18 patients for the training and validation datasets, and then we independently prepared another 3000 pairs of QCT and CBCT images from 12 patients for the test dataset.
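The linear phantom calibration described above can be illustrated with a minimal sketch. It assumes a least-squares line through the three insert densities; the insert HU readings, the function names `fit_calibration` and `apply_calibration`, and the example slice are hypothetical and are not taken from the study.

```python
import numpy as np

# Nominal insert densities of the calibration phantom (mg/cm^3), as stated in the text.
INSERT_BMD = np.array([0.0, 100.0, 200.0])


def fit_calibration(insert_hu):
    """Fit BMD = slope * HU + intercept through the three phantom inserts."""
    hu = np.asarray(insert_hu, dtype=float)
    slope, intercept = np.polyfit(hu, INSERT_BMD, 1)
    return slope, intercept


def apply_calibration(image_hu, slope, intercept):
    """Convert an image in HU (or CBCT grey values) to BMD in mg/cm^3."""
    return slope * np.asarray(image_hu, dtype=float) + intercept


# Hypothetical mean readings inside the 0, 100 and 200 mg/cm^3 inserts.
example_insert_hu = [2.0, 132.0, 262.0]
slope, intercept = fit_calibration(example_insert_hu)

# Hypothetical 2 x 2 patch of a patient slice, converted to BMD.
patient_slice_hu = np.array([[2.0, 132.0], [262.0, 392.0]])
bmd_slice = apply_calibration(patient_slice_hu, slope, intercept)
```

The same two functions would apply to both MDCT (giving QCT) and CBCT (giving CAL_CBCT), each with the readings taken from its own phantom scan.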
QCBCT-NET for quantitative CBCT images. In a previous study, we developed a deep-learning model (QCBCT-NET) composed of Cycle-GAN with residual blocks and multi-channel U-Net to generate QCT-like images from conventional CBCT images (Fig. 2) 77. The Cycle-GAN consisted of two generators, one to map CBCT images to QCT images (G_CBCT→QCT) and one to map QCT images to CBCT images (G_QCT→CBCT), and two discriminators to distinguish between real (D_QCT) and generated (D_CBCT) images 66,77. We used a ResNet architecture with nine residual blocks for the generators and PatchGAN for the discriminators 77. To generate QCBCT images, we constructed a multi-channel U-Net by combining two channel inputs, the original CBCT image and the corresponding output of the Cycle-GAN 77. The QCBCT-NET was trained with the Adam optimizer using a mini-batch size of 4, an epoch length of 100, and a learning rate of 0.0002. The learning rate was set to 0.0001, and momentum terms were assigned a value of 0.9 to stabilize the training 77.
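As a small illustration of the two-channel input to the U-Net, the sketch below stacks a CBCT slice with the corresponding Cycle-GAN output along a leading channel axis. The channels-first layout, the float32 type, and the helper name `make_two_channel_input` are assumptions made for illustration; the networks themselves are not reproduced here.

```python
import numpy as np


def make_two_channel_input(cbct_slice, cyc_cbct_slice):
    """Stack a CBCT slice and its Cycle-GAN output into a (2, H, W) array."""
    cbct = np.asarray(cbct_slice, dtype=np.float32)
    cyc = np.asarray(cyc_cbct_slice, dtype=np.float32)
    if cbct.shape != cyc.shape:
        raise ValueError("both slices must have the same height and width")
    # Channel 0: original CBCT; channel 1: Cycle-GAN output (CYC_CBCT).
    return np.stack([cbct, cyc], axis=0)


# Hypothetical 4 x 4 slices, used only to show the resulting shape.
two_channel = make_two_channel_input(np.zeros((4, 4)), np.ones((4, 4)))
```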
Comparison of BMD measurements from QCBCT and CAL_CBCT images. We measured BMDs using QCBCT and CAL_CBCT images at the bone regions of potential implant sites on the maxilla and mandible in the same axial plane and between axial slices across the entire FOV of the CBCT images to simulate a preoperative BMD evaluation. We selected 36 cubic volumes of interest (VOIs) of 5 × 5 × 5 voxels on the maxilla and mandible for each patient. The center points of the VOIs were localized at the trabecular bones of the six interdental bones of the left and right incisors, premolars, and molars in six slices from the coronal, middle, and apical regions of the tooth on both the maxilla and mandible (Fig. 3). The mean BMD of the VOI voxels was used as the representative BMD at the bone region of the potential implant site.
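The VOI measurement above amounts to averaging a 5 × 5 × 5 block of voxels around a chosen center point. A minimal sketch follows, assuming the volume is indexed as (z, y, x) and that the VOI must lie fully inside the volume; the function name `voi_mean_bmd` and the synthetic volume are hypothetical.

```python
import numpy as np


def voi_mean_bmd(volume, center, size=5):
    """Mean BMD of a cubic VOI of `size` voxels per side centered at `center`."""
    volume = np.asarray(volume, dtype=float)
    half = size // 2
    # Reject centers whose VOI would extend beyond the volume.
    for c, n in zip(center, volume.shape):
        if c - half < 0 or c + half >= n:
            raise ValueError("VOI extends outside the volume")
    z, y, x = center
    voi = volume[z - half:z + half + 1,
                 y - half:y + half + 1,
                 x - half:x + half + 1]
    return float(voi.mean())


# Synthetic 10 x 10 x 10 volume whose value grows linearly with the voxel index.
demo_volume = np.arange(1000, dtype=float).reshape(10, 10, 10)
demo_bmd = voi_mean_bmd(demo_volume, center=(5, 5, 5))
```

Because the synthetic volume is linear in the voxel index, the VOI mean equals the value at the center voxel, which makes the result easy to check by hand.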
To analyze the accuracy of the BMD measurements made from QCBCT and CAL_CBCT images at various bone sites of interest in the axial plane and between axial slices across the entire FOV, the VOIs at the different bone sites were categorized into five groups (Fig. 3). With regard to the accuracy of the measurements at bone sites between axial slices, the VOIs were categorized in three groups as follows: group A, the maxillary bone and the mandibular bone; group B, the maxillary apical bone, the maxillary middle bone, and the maxillary coronal bone; and group C, the mandibular coronal bone, the mandibular middle bone, and the mandibular apical bone. With regard to the accuracy of the measurements at bone sites in the same axial plane, the VOIs were categorized in two groups as follows: group D, the maxillary incisor bone, the maxillary premolar bone, and the maxillary molar bone; and group E, the mandibular incisor bone, the mandibular premolar bone, and the mandibular molar bone (Fig. 3).
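The grouping above can be made concrete by enumerating the 36 VOIs per patient and sorting them into groups A to E. The sketch assumes the 36 VOIs arise as 2 jaws × 3 levels × 3 tooth types × 2 sides, which is one reading of the description; the dictionary keys and the name `group_vois` are hypothetical.

```python
from itertools import product

JAWS = ("maxilla", "mandible")
LEVELS = ("coronal", "middle", "apical")
TEETH = ("incisor", "premolar", "molar")
SIDES = ("left", "right")

# One record per VOI: 2 jaws x 3 levels x 3 tooth types x 2 sides = 36.
vois = [
    {"jaw": j, "level": l, "tooth": t, "side": s}
    for j, l, t, s in product(JAWS, LEVELS, TEETH, SIDES)
]


def group_vois(vois):
    """Sort VOIs into the five comparison groups described in the text."""
    def pick(**conditions):
        return [v for v in vois if all(v[k] == val for k, val in conditions.items())]

    return {
        "A": {j: pick(jaw=j) for j in JAWS},                         # maxilla vs mandible
        "B": {l: pick(jaw="maxilla", level=l) for l in LEVELS},      # between slices, maxilla
        "C": {l: pick(jaw="mandible", level=l) for l in LEVELS},     # between slices, mandible
        "D": {t: pick(jaw="maxilla", tooth=t) for t in TEETH},       # in-plane, maxilla
        "E": {t: pick(jaw="mandible", tooth=t) for t in TEETH},      # in-plane, mandible
    }


groups = group_vois(vois)
```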
To compare the accuracy of BMD measurements between QCBCT and CAL_CBCT images at various bone sites of interest across the entire FOV, we calculated the mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), correlation coefficient (r), coefficient of determination (R²), and standard error of estimation (SEE) between the BMD values from the experimental and QCT (ground truth) images. Paired t-tests were performed on the MAE and MAPE values to determine whether the QCBCT and CAL_CBCT results differed with statistical significance. To evaluate the linearity of the BMD measurements from QCBCT and CAL_CBCT images at various bone sites of interest, a linear regression analysis was performed between the experimental and QCT images, and a Bland-Altman analysis was also performed to compare the agreement between the experimental and QCT results 78.
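The metrics listed above can be sketched as follows. The sketch makes several assumptions that the text does not specify: the QCT value is taken as the regression predictor, R² is computed as the squared Pearson correlation, and SEE is the residual standard deviation with n − 2 degrees of freedom. The function name `bmd_agreement_metrics` and the sample values are hypothetical.

```python
import numpy as np


def bmd_agreement_metrics(measured, reference):
    """Accuracy and linearity metrics between measured and reference BMD values.

    Assumes every reference value is nonzero (needed for MAPE) and n > 2.
    """
    measured = np.asarray(measured, dtype=float)
    reference = np.asarray(reference, dtype=float)
    err = measured - reference

    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    mape = 100.0 * np.mean(np.abs(err) / np.abs(reference))

    # Linear regression of measured BMD on reference (QCT) BMD.
    slope, intercept = np.polyfit(reference, measured, 1)
    residuals = measured - (slope * reference + intercept)
    r = np.corrcoef(reference, measured)[0, 1]
    see = np.sqrt(np.sum(residuals ** 2) / (len(reference) - 2))

    return {"MAE": mae, "RMSE": rmse, "MAPE": mape,
            "slope": slope, "r": r, "R2": r ** 2, "SEE": see}


# Hypothetical example: every measurement is 10 mg/cm^3 above the reference.
demo_reference = np.array([100.0, 200.0, 300.0, 400.0])
demo_metrics = bmd_agreement_metrics(demo_reference + 10.0, demo_reference)
```

In this example the constant offset gives an MAE and RMSE of 10 mg/cm³ with a slope of 1 and an SEE of essentially zero, which shows that SEE reflects scatter around the regression line rather than bias.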
To compare the accuracy of BMD measurements between different bone sites of interest in the axial plane and between axial slices across the entire FOV, one-way ANOVA tests were performed among MAPEs at the different bone sites for both QCBCT and CAL_CBCT images. When statistical significance was shown in the ANOVA test, the statistically significant differences and relations were identified using a post-hoc analysis (Tukey's HSD test). All tests were performed with a significance level of 0.01 (SPSS v26, SPSS Inc., Chicago, IL, USA).
Figure 2. The QCBCT-NET architecture combining Cycle-GAN and the multi-channel U-Net 77. The Cycle-GAN consisted of two generators, G_CBCT→QCT and G_QCT→CBCT, and two discriminators, D_CBCT and D_QCT. The multi-channel U-Net had two-channel inputs of CBCT and corresponding CYC_CBCT images, consisting of 3 × 3 convolution layers with batch normalization and ReLU activation, and had skip connections at each layer level. Max-pooling was used for down-sampling and transposed convolution was used for up-sampling. Consequently, the QCBCT-NET generated QCBCT images from CBCT images to quantitatively measure BMD in CBCTs.
Figure 4 shows axial-slice QCT, QCBCT, and CAL_CBCT images used to calculate BMD at the maxilla and mandible. Compared with the original QCT images, the QCBCT image quality in both regions demonstrated a substantial improvement over that in the CAL_CBCT images in terms of BMD determination. Compared with the CAL_CBCT images, the QCBCT images showed considerably decreased disparities around the teeth and areas of higher BMD. Because complete matching between the CT and CBCT images was not possible owing to differences in the patients' head positions between the CT and CBCT scans, large differences were observed in the airways, spine, and soft tissues in both the QCBCT and CAL_CBCT images (Fig. 4).
Figure 5 shows the BMD profiles that were acquired along the dental arch at the maxilla and mandible in the QCT and experimental images shown in Fig. 4. Compared with that from the CAL_CBCT images, the BMD profile from the QCBCT images more closely resembles and better correlates with that in the original QCT images. Therefore, the QCBCT images show more similarity to the QCT images than the CAL_CBCT images at both the maxilla and mandible.
Table 1 shows the accuracy (MAE, RMSE, and MAPE) of the BMD measurements made using the QCBCT and CAL_CBCT images, as compared with the QCT images at various bone sites of interest. The BMD values measured using the QCBCT images showed significantly lower MAE and MAPE values than those measured using the CAL_CBCT images for all bone sites across the entire FOV (p < 0.01). The mean RMSE of the QCBCT images across all sites was 83.41 mg/cm³, whereas that of the CAL_CBCT images was 491.15 mg/cm³. The MAE of the QCBCT images showed an overall mean of 67.94 mg/cm³, ranging from 63.66 at the mandible to 72.21 at the maxilla, whereas the MAE for the CAL_CBCT images had an overall mean of 460.52 mg/cm³, ranging from 444.32 at the maxilla to 476.73 at the mandible. The MAPE of the QCBCT images showed an overall mean of 8.32%, ranging from 7.89 at the mandible to 8.76 at the maxilla, whereas the MAPE of the CAL_CBCT images had an overall mean of 54.29%, ranging from 52.27 at the maxilla to 56.31 at the mandible. Figure 6 shows that the BMD distribution from the QCBCT images was closer to that in the original QCT images than that in the CAL_CBCT images for both the maxilla and mandible. Therefore, the BMD measurements made using the QCBCT images were more accurate than those made using the CAL_CBCT images, regardless of the bone site of interest or anatomical structures of the maxilla and mandible across the entire FOV.
Table 2 presents the results (r, R², and SEE) from the linearity analysis comparing the BMD values from the QCBCT and CAL_CBCT images with those from the QCT (ground truth) images at various bone sites of interest. The QCBCT images showed closer linear relationships with the QCT results than the CAL_CBCT images at various bone sites across the entire FOV. The linear regression between the QCBCT and QCT images showed an overall slope of 1.00, ranging from 0.93 at the mandible to 1.08 at the maxilla, whereas the regression between the CAL_CBCT and QCT images had an overall slope of 0.32, ranging from 0.29 at the mandible to 0.35 at the maxilla. The overall coefficient of determination for the BMD values from the QCBCT images compared with the QCT images was 0.85, whereas that for the CAL_CBCT images was 0.24. The overall correlation coefficient for the BMD values from the QCBCT images compared with those from the QCT images was 0.92, whereas that for the CAL_CBCT images was 0.49. With the larger slope and better goodness of fit, the linear relationship between the QCBCT and QCT images demonstrated greater contrast and correlation than that between the CAL_CBCT and QCT images at the different bone sites (Fig. 7). The Bland-Altman plot between the QCBCT and QCT images also demonstrated a more linear relationship and better agreement limits than the plot between the CAL_CBCT and QCT images at the different bone sites (Fig. 8). The mean difference between the QCT and QCBCT images was small, −19.65 mg/cm³, with a relatively uniform distribution around the mean difference value, whereas that between the CAL_CBCT and QCT images was large, 460.52 mg/cm³, with an abnormal pattern of an upward slope (Fig. 8). The 95% limits of agreement for … (Fig. 8). Furthermore, the overall SEE for the BMD values from the QCBCT images (compared with the QCT images) was 81.06 mg/cm³, whereas that for the CAL_CBCT images was 109.32 mg/cm³.
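The Bland-Altman quantities referred to above (mean difference and 95% limits of agreement) can be computed as in the sketch below. It assumes the conventional definition of the limits as the mean difference ± 1.96 sample standard deviations and takes the difference as measured minus reference; the sign convention used in the study is not stated, and the function name `bland_altman` and the sample values are hypothetical.

```python
import numpy as np


def bland_altman(measured, reference):
    """Return the mean difference (bias) and the 95% limits of agreement."""
    measured = np.asarray(measured, dtype=float)
    reference = np.asarray(reference, dtype=float)
    diff = measured - reference            # y-axis of the Bland-Altman plot
    pair_mean = (measured + reference) / 2  # x-axis of the Bland-Altman plot
    bias = diff.mean()
    sd = diff.std(ddof=1)                   # sample standard deviation
    return {"bias": bias,
            "lower": bias - 1.96 * sd,
            "upper": bias + 1.96 * sd,
            "pair_mean": pair_mean,
            "diff": diff}


# Hypothetical paired BMD values (mg/cm^3) with errors of +10 and -10.
ba = bland_altman([110.0, 190.0, 310.0, 390.0], [100.0, 200.0, 300.0, 400.0])
```

A trend of `diff` against `pair_mean`, such as the upward slope described for the CAL_CBCT images, indicates that the error depends on the BMD level.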
Generally, the SEE for the QCBCT images was smaller than that for the CAL_CBCT images at all bone sites of interest, indicating that the QCBCT images enabled BMD measurements with higher uniformity (Table 2). Therefore, the BMD measurements made using the QCBCT images demonstrated more linear relationships, greater contrast, and higher uniformity with the QCT images than those made using the CAL_CBCT images at the various bone sites across the entire FOV.
The results in Table 3 indicate that the MAPE of the QCBCT images did not differ significantly among the bone sites in all groups (p > 0.01), whereas the MAPE of the CAL_CBCT images did differ significantly among bone sites in all groups (p < 0.01). In the BMD measurements at bone sites between axial slices, the MAPE of the CAL_CBCT images increased in groups A (Maxilla < Mandible), B (Mx-Coronal < Mx-Apical), and C (Mn-Middle, Mn-Coronal < Mn-Apical), as the distance from the center of the FOV increased in the z-axis (p < 0.01) (Table 3). In the BMD measurements at bone sites in the axial plane, the MAPE of the CAL_CBCT images increased in groups D (Mx-Molar < Mx-Incisor) and E (Mn-Molar < Mn-Incisor) as the distance from the center of the FOV increased in the xy-plane (p < 0.01) (Table 3). The center of the FOV in all the CBCT images was between the middle and the coronal site of the maxillary tooth in the z-axis, and in the soft palate region behind the molars in the xy-plane (Fig. 3). On the other hand, the MAPE of the QCBCT images did not differ significantly among bone sites in the axial plane or between axial slices (p > 0.01). Furthermore, the BMD measurements made using the QCBCT images showed better MAPE performance than those made using the CAL_CBCT images, with a smaller dispersion of data and shorter whisker lengths in the boxplots across all bone sites of interest (Fig. 9).
Therefore, the BMD measurements made using the QCBCT images demonstrated better reliability than those made using the CAL_CBCT images without regard to bone sites in the axial plane or between axial slices across the entire FOV.
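The site-to-site comparison of MAPEs above rests on one-way ANOVA. As a minimal sketch of the underlying statistic, the code below computes the F value and its degrees of freedom from per-site samples. It does not compute the p-value or the Tukey HSD post-hoc test, which the study obtained with SPSS; the function name `one_way_anova_f` and the sample data are hypothetical.

```python
import numpy as np


def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way ANOVA."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)                       # number of bone sites compared
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = np.concatenate(groups).mean()

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

    df_between, df_within = k - 1, n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical MAPE samples (%) at three bone sites.
f_value, df_b, df_w = one_way_anova_f([[1.0, 2.0, 3.0],
                                       [2.0, 3.0, 4.0],
                                       [3.0, 4.0, 5.0]])
```

A large F value relative to the F distribution with (`df_b`, `df_w`) degrees of freedom would indicate that mean MAPE differs among sites, which is the pattern reported for the CAL_CBCT images but not for the QCBCT images.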

Discussion
The BMD measurement is a direct method of estimating human bone mass for diagnosing osteoporosis, and is performed to objectively evaluate bone quality before implant surgery in dental clinics. The objective of this study was to validate the accuracy and reliability of BMD measurements made using deep-learning-based quantitative cone-beam CT (CBCT) images from actual patients. In a previous study, we measured BMD directly and quantitatively from CBCT images using a developed deep-learning model (QCBCT-NET) and human skull phantoms 77. The QCBCT-NET greatly increased the linearity and uniformity of CBCT images, compared with the Cycle-GAN and U-Net deep learning models, which we demonstrated by showing improvements in quantitative performance 77. In this study, we evaluated the accuracy and reliability of BMD measurements from QCBCT images created by applying QCBCT-NET to clinical data from actual patients. We analyzed and compared the accuracy of BMD measurements between CAL_CBCT and QCBCT images from various bone sites of interest across the entire FOV. The QCBCT images showed better accuracy and uniformity and improved linearity and contrast compared with the CAL_CBCT images across the entire FOV. The QCBCT images had an RMSE of 83.41 mg/cm³, an MAE of 67.94 mg/cm³, and a MAPE of 8.32% for all bone sites of interest across the entire FOV, whereas the CAL_CBCT images had an RMSE of 491.15 mg/cm³, an MAE of 460.52 mg/cm³, and a MAPE of 54.29%. The linear regression between the QCBCT and QCT images showed a slope of 1.00 and a coefficient of determination of 0.85, whereas those values with the CAL_CBCT images were 0.32 and 0.24, respectively. The correlation coefficient for the BMD values from the QCBCT images, compared with the QCT images, was 0.92, whereas that from the CAL_CBCT images was 0.49. The overall SEE between the QCBCT images and the QCT images was 81.06 mg/cm³, whereas that for the CAL_CBCT images was 109.32 mg/cm³.
Therefore, the BMD measurements made using the QCBCT images had high accuracy and reliability, regardless of the relative geometric positions of the bone across the entire FOV of the CBCT.
The results from the Bland-Altman analysis indicate that compared with the CAL_CBCT images, QCBCT images had a smaller MAE with the QCT images and showed a relatively uniform distribution around the mean difference values in all BMD ranges, indicating better agreement. The CAL_CBCT images showed a significantly upward slope pattern in the Bland-Altman plot, whereas the QCBCT showed a slight downward slope pattern. As the mean BMD values from the QCT and QCBCT images increased, it appears that the size of the error increased proportionately in the plot. Therefore, it would be more beneficial to use the MAPE, a scale-independent accuracy metric, instead of the MAE when comparing the accuracy of BMD measurements between bone sites with different BMD values across the entire FOV.
To evaluate differences in BMD measurement accuracy between various bone sites in the axial plane and between axial slices across the entire FOV of CBCT, we compared the MAPEs at bone sites of interest according to the categorized groups. The MAPEs of the QCBCT images did not differ significantly among bone sites in any group across the entire FOV; however, the CAL_CBCT images showed significant differences in all groups. In the CAL_CBCT images, the accuracy at bone sites of interest in the axial plane and between axial slices decreased as the site of interest became farther from the center of the FOV. Between axial slices, the cone-shaped beam geometry of CBCT means that the incident beam angle is not parallel as the target becomes farther away from the center of the FOV, which causes a decrease in accuracy and an increase in the variability of BMD measurements 42. Pauwels et al. argued that image quality degradation and artifacts can occur more in the top and bottom parts of the FOV due to the wide cone angle in some CBCT images 40, which is consistent with our findings for the CAL_CBCT images.
In the axial plane, the beam geometry of CBCT causes asymmetry in the X-ray path, leading to position-dependent differences in beam hardening and endo/exo-mass effects and a consequent difference in BMD accuracy between the central and peripheral parts of the FOV 79-81 . Plachtovics et al. showed that the difference between the HU of MDCT and the GV of CBCT increased from the center of the FOV to the outer border within an axial slice, resulting in a decrease in the density measurement accuracy of CBCT, which is consistent with our findings for the CAL_CBCT images 82 .
The significant difference in MAPE values between the bone sites in the CAL_CBCT images, both within the axial plane and between axial slices, reflects the nonlinearity and non-uniformity caused by the cone-beam geometry of CBCT. In contrast, no significant differences in MAPE values were found among bone sites across the entire FOV in the QCBCT images, which indicates that the inherent nonlinearity and non-uniformity of CBCT were reduced, improving the resulting BMD measurements. Therefore, the BMD measurements made using the QCBCT images demonstrated high reliability without regard to the relative geometric positions of the bone sites across the entire FOV. On the other hand, the MAPE of the QCBCT images showed larger values in the premolar and molar bones than in the incisor bone at both the maxilla and mandible, although the differences were not significant. This suggests that artifacts from metallic or resin restorations, which are more common in the premolars and molars, might reduce the reliability of the QCBCT images in patients.
QCBCT images will bring great progress in measuring bone density and bone mass using CBCT in dental clinical practice and enable the accurate, quantitative evaluation of bone density using only CBCT equipment. We have shown that it can be applied to actual patient clinical data for use in dental implant treatment. It can also help to more accurately estimate the dimension of residual bone and cortical bone thickness during the preoperative evaluation. An accurate preoperative evaluation of bone quality and quantity at the site for implant placement can help to predict the primary implant stability and determine whether implant placement is possible 83,84. By using this information to make a treatment plan, the failure rate of dental implant placements could be reduced. The secondary stability of an implant is determined by the degree of apposition between the bone and the implant, which is affected by the macro- or micro-architecture of the surrounding bone after implant placement 85. Therefore, QCBCT images can be helpful in indirectly and easily assessing the secondary stability of implants by accurately measuring the degree of osseointegration at the implant/bone interface after implant surgery. For example, in the case of complications such as a sharp decrease in bone density around the implant in a QCBCT image obtained after surgery, a rapid and precise clinical response will be possible.
By using QCBCT images, we obtained BMD values with improved quantitative accuracy, linearity, and uniformity at the bone regions of a potential implant site across the entire FOV, compared with conventional CBCT images. Overall, the use of QCBCT images can provide more accurate BMD measurements over a larger FOV than conventional CBCT images, making it a valuable tool for assessing bone quality and density in a variety of clinical settings. Further research is needed on the use of QCBCT images to assess bone quality and density for diagnosing and treating various human organs and structures.
This study has some limitations. The first concerns the accuracy of registration. It is difficult to obtain identical images from CBCT and MDCT due to differences in the scanning posture and scanning environment when acquiring the datasets, and small human errors can occur when setting landmarks in both images during the registration process. When testing the model performance by sampling the sites of interest, inaccurate registration can lead to image-to-image anatomical inconsistencies, resulting in poor evaluation results. The second limitation is the diversity of the datasets. Lack of diversity in datasets causes bias in deep learning models and increases the risk of overfitting 86. There is a need to train and evaluate deep learning models with various datasets acquired from multiple institutions using multiple types of equipment 87. The third limitation is the number of patients. It is necessary to increase the number of patients requiring various dental treatments to demonstrate the performance of the developed method more definitively and enable its use in various clinical environments. Finally, the artifacts of metallic or resin restorations present in the oral cavity of actual patients 50 can degrade the quality of QCBCT images and lead to inaccurate BMD measurements. However, this study did not comprehensively analyze the potential impact of these artifacts on the measurement results. Therefore, when interpreting QCBCT images of patients, it is important to be aware that such artifacts may reduce their reliability.

Conclusions
When applied to actual patient CBCT images, quantitative CBCT images for BMD measurement based on deep learning demonstrated high accuracy and reliability at various bone regions around potential implant sites across the entire FOV of CBCT. Therefore, quantitative CBCT-based BMD measurement can be a valuable tool for assessing bone quality and density when diagnosing and treating various human organs and structures by providing high-quality BMD images that can aid in accurate and effective treatment.